157 research outputs found
Symbiosis between the TRECVid benchmark and video libraries at the Netherlands Institute for Sound and Vision
Audiovisual archives are investing in large-scale digitisation efforts of their analogue holdings and, in parallel, ingesting an ever-increasing amount of born- digital files in their digital storage facilities. Digitisation opens up new access paradigms and boosted re-use of audiovisual content. Query-log analyses show the shortcomings of manual annotation, therefore archives are complementing these annotations by developing novel search engines that automatically extract information from both audio and the visual tracks. Over the past few years, the TRECVid benchmark has developed a novel relationship with the Netherlands Institute of Sound and Vision (NISV) which goes beyond the NISV just providing data and use cases to TRECVid. Prototype and demonstrator systems developed as part of TRECVid are set to become a key driver in improving the quality of search engines at the NISV and will ultimately help other audiovisual archives to offer more efficient and more fine-grained access to their collections. This paper reports the experiences of NISV in leveraging the activities of the TRECVid benchmark
Automatic summarization of rushes video using bipartite graphs
In this paper we present a new approach for automatic summarization of rushes, or unstructured video. Our approach is composed of three major steps. First, based on shot and sub-shot segmentations, we filter sub-shots with low information content not likely to be useful in a summary. Second, a method using maximal matching in a bipartite graph is adapted to measure similarity between the remaining shots and to minimize inter-shot redundancy by removing repetitive retake shots common in rushes video. Finally, the presence of faces and motion intensity are characterised in each sub-shot. A measure of how representative the sub-shot is in the context of the overall video is then proposed. Video summaries composed of keyframe slideshows are then generated. In order to evaluate the effectiveness of this approach we re-run the evaluation carried out by TRECVid, using the same dataset and evaluation metrics used in the TRECVid video summarization task in 2007 but with our own assessors. Results show that our approach leads to a significant improvement on our own work in terms of the fraction of the TRECVid summary ground truth included and is competitive with the best of other approaches in TRECVid 2007
Experiences of aiding autobiographical memory using the sensecam
Human memory is a dynamic system that makes accessible certain memories of events based on a hierarchy of information, arguably driven by personal significance. Not all events are remembered, but those that are tend to be more psychologically relevant. In contrast, lifelogging is the process of automatically recording aspects of one's life in digital form without loss of information. In this article we share our experiences in designing computer-based solutions to assist people review their visual lifelogs and address this contrast. The technical basis for our work is automatically segmenting visual lifelogs into events, allowing event similarity and event importance to be computed, ideas that are motivated by cognitive science considerations of how human memory works and can be assisted. Our work has been based on visual lifelogs gathered by dozens of people, some of them with collections spanning multiple years. In this review article we summarize a series of studies that have led to the development of a browser that is based on human memory systems and discuss the inherent tension in storing large amounts of data but making the most relevant material the most accessible
On Synergies Between Information Retrieval and Digital Libraries
In this paper we present the results of a longitudinal analysis of ACM SIGIR papers from 2003 to 2017. ACM SIGIR is the main venue where Information Retrieval (IR) research and innovative results are presented yearly; it is a highly competitive venue and only the best and most relevant works are accepted for publication. The analysis of ACM SIGIR papers gives us a unique opportunity to understand where the field is going and what are the most trending topics in information access and search.
In particular, we conduct this analysis with a focus on Digital Library (DL) topics to understand what is the relation between these two fields that we know to be closely linked. We see that DL provide document collections and challenging tasks to be addressed by the IR community and in turn exploit the latest advancements in IR to improve the offered services.
We also point to the role of public investments in the DL field as one of the core drivers of DL research which in turn may also have a positive effect on information accessing and searching in general
A framework for automatic semantic video annotation
The rapidly increasing quantity of publicly available videos has driven research into developing automatic tools for indexing, rating, searching and retrieval. Textual semantic representations, such as tagging, labelling and annotation, are often important factors in the process of indexing any video, because of their user-friendly way of representing the semantics appropriate for search and retrieval. Ideally, this annotation should be inspired by the human cognitive way of perceiving and of describing videos. The difference between the low-level visual contents and the corresponding human perception is referred to as the âsemantic gapâ. Tackling this gap is even harder in the case of unconstrained videos, mainly due to the lack of any previous information about the analyzed video on the one hand, and the huge amount of generic knowledge required on the other. This paper introduces a framework for the Automatic Semantic Annotation of unconstrained videos. The proposed framework utilizes two non-domain-specific layers: low-level visual similarity matching, and an annotation analysis that employs commonsense knowledgebases. Commonsense ontology is created by incorporating multiple-structured semantic relationships. Experiments and black-box tests are carried out on standard video databases for action recognition and video information retrieval. White-box tests examine the performance of the individual intermediate layers of the framework, and the evaluation of the results and the statistical analysis show that integrating visual similarity matching with commonsense semantic relationships provides an effective approach to automated video annotation
Behavior Discovery and Alignment of Articulated Object Classes from Unstructured Video
We propose an automatic system for organizing the content of a collection of
unstructured videos of an articulated object class (e.g. tiger, horse). By
exploiting the recurring motion patterns of the class across videos, our
system: 1) identifies its characteristic behaviors; and 2) recovers
pixel-to-pixel alignments across different instances. Our system can be useful
for organizing video collections for indexing and retrieval. Moreover, it can
be a platform for learning the appearance or behaviors of object classes from
Internet video. Traditional supervised techniques cannot exploit this wealth of
data directly, as they require a large amount of time-consuming manual
annotations.
The behavior discovery stage generates temporal video intervals, each
automatically trimmed to one instance of the discovered behavior, clustered by
type. It relies on our novel motion representation for articulated motion based
on the displacement of ordered pairs of trajectories (PoTs). The alignment
stage aligns hundreds of instances of the class to a great accuracy despite
considerable appearance variations (e.g. an adult tiger and a cub). It uses a
flexible Thin Plate Spline deformation model that can vary through time. We
carefully evaluate each step of our system on a new, fully annotated dataset.
On behavior discovery, we outperform the state-of-the-art Improved DTF
descriptor. On spatial alignment, we outperform the popular SIFT Flow
algorithm.Comment: 19 pages, 19 figure, 3 tables. arXiv admin note: substantial text
overlap with arXiv:1411.788
Test of lepton universality in decays
The first simultaneous test of muon-electron universality using
and decays is performed, in two ranges of the dilepton
invariant-mass squared, . The analysis uses beauty mesons produced in
proton-proton collisions collected with the LHCb detector between 2011 and
2018, corresponding to an integrated luminosity of 9 . Each
of the four lepton universality measurements reported is either the first in
the given interval or supersedes previous LHCb measurements. The
results are compatible with the predictions of the Standard Model.Comment: All figures and tables, along with any supplementary material and
additional information, are available at
https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2022-046.html (LHCb
public pages
- âŠ